Simple Example: Two Random uniform Distributions

To demonstrate effectiveness of ICA on decomposing source signals and to develop a better undering, a very simple toy example is simulated. The“unknown” source components in this model are just two random uniform variables in the matrix \(S\), distorted by some arbitary “unknown”

Here it is a 2x2 matrix \(A\)

\[A = \begin{bmatrix} 1 & 1 \\ -1 & 3 \end{bmatrix}\] .

The known observations is the product of these two matrices, or matrix \(X\).

To test ICA, apply FastICA to this simple model and generate 5 plots:

\(\bullet\) The pre-processed data

\(\bullet\) PCA components (for comparison)

\(\bullet\) ICA components

    - fastICA
    
    - JADE method
    
    - Infomax 
set.seed(10)
library(fastICA)
library(JADE)
library(ica)
library(gridExtra)
library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## combine(): dplyr, gridExtra
## filter():  dplyr, stats
## lag():     dplyr, stats
S <- matrix(runif(10000), 1000, 2)
A <- matrix(c(1, 1, -1, 3), 2, 2, byrow = TRUE)
X <- S %*% A

a <- fastICA(X, 2, alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.000001
b <- icajade(X,2,center=TRUE,maxit=200,tol=0.0001)

c <- icaimax(X,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")
  
par(mfrow = c(1, 2))
plot(a$X, main = "Pre-processed data")
plot(a$X %*% a$K, main = "PCA components")

par(mfrow = c(1, 3))

plot(a$S, main = "ICA components fastICA")
plot(b$S,main="ICA Jade")
plot(c$S, main="ICA Infomax")

Plot Interpretation

Looking at these plots, we see that the preprocessed data is rotated and no signals are clear from it.

PCA appear to have flattened the data slightly and creates almost a mirror of the original data, but the true signals from the distributions still do not appear completely evident. PCA does not effectively separate sources given how it rotates data.

However, looking at the ICA transformation, we can which clearly see the shape of a uniform distribution forming after the decomposition. ICA was able to recover the original signals quite easily in this simple simulation. The results for all 3 ICA packages (JADE,Infomax and fastICA) appear to produce approximately the same plots for this example.

(More) Complex Example: The Cocktail Party Problem

The most common example used to explain and motive the use of ICA is the “Cocktail Party Problem.”

Two microphones are placed in a room where to different people are speaking at the same time. Each microphone signal is defined as \(x_1(t)\) and \(x_2(t)\), with some observed (i.e. known) amplitudes \(x_1\) and \(x_2\) with time as an index. The unknown speech signals by the speakers are \(s_1(t)\) and \(s_2(t)\) to make up the following system of equations:

\[ x_1(t) = a_{11}s_1 +a_{12}s_2 \]

\[ x_2(t) = a_{21}s_1 +a_{22}s_2 \]

This can also be rewritten as:

\[x_i = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\] \[s_i = \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}\] \[A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\]

If we knew the weighted \(a_{ij}\), this problem could be easily solved but since we do not, we must use blind source separation.

Example with Cocktail Party Data

As a test example/simulation, we can use the CPPdata from the JADE package that contains 50,000 observations from 4 different microphones.

## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.938170
## Iteration 2 tol=0.830189
## Iteration 3 tol=0.560629
## Iteration 4 tol=0.050488
## Iteration 5 tol=0.002132
## Iteration 6 tol=0.000091
## Warning in icajade(dat, 4, center = TRUE, maxit = 200, tol = 1e-04): Numerical rank of X is less than requested number of components (nc).
##   Number of components has been redefined as the numerical rank of X.
## Warning in icaimax(dat, nc = 4, center = TRUE, maxit = 200, tol = 1e-04, : Numerical rank of X is less than requested number of components (nc).
##   Number of components has been redefined as the numerical rank of X.

Plot Interpretation

Again, we have a highly rotated set of pre-processed data and a PCA rotation that only transforms the data enough to separate the mixture ever so slightly.

With ICA, we see a full rotation and a much cleaner shape beginning to form, almost diamond-like. Like with the random uniform example, the superior performance of any one of the three ICA methods is not identifible just from these primitive plots of the model.

Of note with the ICA algorithms example is the “nc” (number of components) input of the function for each of these different methods. Particularly, both the JADE and Infomax produce similar plots at \(nc=4\), which makes sense as the dataset \(CPPdata\) has observations from 4 microphones. However, to generate a similar plot for fastICA, \(nc=2\) was used. The results for \(nc=4\) with ICA appear as below (JADE and Infomax shown at \(nc=2\) for reverse comparison:

set.seed(10)
library(fastICA)
S <- matrix(runif(10000), 1000, 2)
A <- matrix(c(1, 1, -1, 3), 2, 2, byrow = TRUE)
X <- S %*% A
m <- fastICA(dat,4,alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.787709
## Iteration 2 tol=0.598775
## Iteration 3 tol=0.300622
## Iteration 4 tol=0.115911
## Iteration 5 tol=0.035327
## Iteration 6 tol=0.009270
## Iteration 7 tol=0.002285
## Iteration 8 tol=0.000553
## Iteration 9 tol=0.000162
## Iteration 10 tol=0.000077
n <- icajade(dat,2,center=TRUE,maxit=200,tol=0.0001)

o <- icaimax(dat,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1,3))
plot(m$S, main = "ICA components")
plot(n$S, main = "ICA JADE")
plot(o$S, main ="ICA Infomax")

While JADE and Infomax at \(nc=2\) do not differ greatly from \(nc=4\), the fastICA plot appears more similar to the pre-processed data when the number of components is set at 4.

Explain what this could mean

Simulation

Given the simplistic nature of both toy examples, a simulation can provide more robust insight into the effectiveness of ICA in extracting original source components.

For this, we will use the Monte Carlo Markov Chain (MCMC) and see how ICA performs in a “noisier” setting.

Random Uniform Distribution Sim

N <- 10000
sim_S <- function(N,noise) {
  U <- runif(N, 0,1)
  samples <- rep(0,N)
  X <- rexp(N,1)
  S <- runif(10000)
  for(i in 1:N) {
    if(U[i]<noise) {
      samples[i] = X[i]

    }else {
      samples[i] = S[i]}

  }
    m.samples <- matrix(samples,N,2,byrow=T)
  return(m.samples)
  
}

x1 <- sim_S(N,.05)
x2 <- sim_S(N,.10)
x3 <- sim_S(N, .15)

A <- matrix(c(1, 1, -1, 3), 2, 2, byrow = TRUE)
X <- x1 %*% A

a <- fastICA(X, 2, alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.000215
## Iteration 2 tol=0.000320
## Iteration 3 tol=0.000501
## Iteration 4 tol=0.000832
## Iteration 5 tol=0.001479
## Iteration 6 tol=0.002815
## Iteration 7 tol=0.005437
## Iteration 8 tol=0.007751
## Iteration 9 tol=0.001121
## Iteration 10 tol=0.000002
b <- icajade(X,2,center=TRUE,maxit=200,tol=0.0001)

c <- icaimax(X,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1, 2))
plot(a$X, main = "Pre-processed data")
plot(a$X %*% a$K, main = "PCA components")

par(mfrow = c(1, 3))

plot(a$S, main = "ICA components fastICA")
plot(b$S,main="ICA Jade")
plot(c$S, main="ICA Infomax")

A <- matrix(c(1, 1, -1, 3), 2, 2, byrow = TRUE)
X <- S %*% A

a <- fastICA(X, 2, alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.131085
## Iteration 2 tol=0.000336
## Iteration 3 tol=0.000004
b <- icajade(X,2,center=TRUE,maxit=200,tol=0.0001)

c <- icaimax(X,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1, 2))
plot(a$X, main = "Pre-processed data")
plot(a$X %*% a$K, main = "PCA components")

par(mfrow = c(1, 3))

plot(a$S, main = "ICA components fastICA")
plot(b$S,main="ICA Jade")
plot(c$S, main="ICA Infomax")

CPP Simulation:

Adding noise from Random Uniform

What is the measure of comparison?

Does well here/not well

m <- fastICA(r1,2,alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.006395
## Iteration 2 tol=0.000040
n <- icajade(r1,2,center=TRUE,maxit=200,tol=0.0001)

o <- icaimax(r1,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1,2))
plot(m$X, main = "Pre-processed data")
plot(m$X %*% m$K, main = "PCA components")

par(mfrow = c(1,3))

plot(m$S, main = "ICA components")
plot(n$S, main = "ICA JADE")
plot(o$S, main ="ICA Infomax")

m <- fastICA(r2,2,alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.022903
## Iteration 2 tol=0.002771
## Iteration 3 tol=0.000168
## Iteration 4 tol=0.000008
n <- icajade(r2,2,center=TRUE,maxit=200,tol=0.0001)

o <- icaimax(r2,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1,2))
plot(m$X, main = "Pre-processed data")
plot(m$X %*% m$K, main = "PCA components")

par(mfrow = c(1,3))

plot(m$S, main = "ICA components")
plot(n$S, main = "ICA JADE")
plot(o$S, main ="ICA Infomax")

m <- fastICA(r3,2,alg.typ = "parallel", fun = "logcosh", alpha = 1, 
             method = "C", row.norm = FALSE, maxit = 200, 
             tol = 0.0001, verbose = TRUE)
## Centering
## Whitening
## Symmetric FastICA using logcosh approx. to neg-entropy function
## Iteration 1 tol=0.050045
## Iteration 2 tol=0.014569
## Iteration 3 tol=0.000013
n <- icajade(r3,2,center=TRUE,maxit=200,tol=0.0001)

o <- icaimax(r3,nc=2,center=TRUE,maxit=200,tol=0.0001,alg="newton",fun="log")

par(mfrow = c(1,2))
plot(m$X, main = "Pre-processed data")
plot(m$X %*% m$K, main = "PCA components")

par(mfrow = c(1,3))

plot(m$S, main = "ICA components")
plot(n$S, main = "ICA JADE")
plot(o$S, main ="ICA Infomax")